feat: verda cloud (gpu/ai) provider support by rafeegnash · Pull Request #142 · bgdnvk/clanker

rafeegnash · 2026-04-19T20:40:58Z

Summary

Adds Verda Cloud (ex-DataCrunch) as a first-class clanker provider, mirroring the shape of the existing cf/do/hetzner/vercel integrations.
New clanker verda command tree (list/get/action/balance + verda ask) plus clanker ask --verda with keyword routing.
Adds verda-instant Kubernetes cluster provider so clanker k8s can provision and pull kubeconfigs from Verda Instant Clusters.
Exposes clanker_verda_ask / clanker_verda_list over MCP.

Details

OAuth2 Client Credentials flow against https://api.verda.com/v1, with expires_in-driven refresh, 429 / Retry-After, 207 multi-status and {code,message} error decoding.
Credential resolution order matches other providers: ~/.clanker.yaml → VERDA_* env → ~/.verda/credentials (written by verda auth login).
Pulls kubeconfig off an Instant Cluster's head node via SSH and rewrites the server: URL to the public IP.
Unit tests cover token caching, 429 retry, multi-status decode, and credentials-file parsing (YAML + flat).
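
The expires_in-driven refresh can be pictured with a minimal sketch like the one below. It is not the code from this PR: `tokenCache`, `fetchFunc`, `newTokenCache`, and the 30-second `refreshSkew` are hypothetical names and values chosen for illustration. The fetch runs outside the lock, in the spirit of the later "release mutex during oauth token fetch" commit.

```go
package main

import (
	"sync"
	"time"
)

// refreshSkew renews the token slightly before the reported expiry.
// The 30s value is an assumption, not taken from the PR.
const refreshSkew = 30 * time.Second

// fetchFunc stands in for the OAuth2 client-credentials request. It returns
// the access token and the expires_in value (in seconds) from the response.
type fetchFunc func() (token string, expiresIn int, err error)

// tokenCache is a hypothetical cache; the real client type may differ.
type tokenCache struct {
	mu      sync.Mutex
	token   string
	expires time.Time
	fetch   fetchFunc
}

func newTokenCache(fetch fetchFunc) *tokenCache {
	return &tokenCache{fetch: fetch}
}

// Token returns the cached token while it is still fresh, otherwise it
// fetches a new one and records the new expiry.
func (c *tokenCache) Token() (string, error) {
	c.mu.Lock()
	if c.token != "" && time.Now().Before(c.expires.Add(-refreshSkew)) {
		tok := c.token
		c.mu.Unlock()
		return tok, nil
	}
	c.mu.Unlock()

	// The mutex is not held during the network call, so a slow token
	// endpoint cannot block other callers on the lock.
	tok, expiresIn, err := c.fetch()
	if err != nil {
		return "", err
	}

	c.mu.Lock()
	c.token = tok
	c.expires = time.Now().Add(time.Duration(expiresIn) * time.Second)
	c.mu.Unlock()
	return tok, nil
}
```

The trade-off in this sketch is that two callers arriving with a stale token may both fetch; the second result simply overwrites the first.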

Test plan

make fmt vet test-short build
./bin/clanker verda --help shows the new tree
./bin/clanker ask --help lists --verda
./bin/clanker verda list instances against a real Verda account
./bin/clanker verda ask "what's my balance and running GPUs?" against a real account
./bin/clanker mcp --transport http --listen :39393 exposes clanker_verda_*

- release mutex during oauth token fetch to avoid deadlock risk
- honor context cancellation during retry backoffs
- proactively bound polling sleep to remaining deadline
- drop `offline` from terminal instance status so transient stops don't end polling early
- atomic rename on conversation history save so a crash mid-write can't corrupt it

- pass remote path to ssh as a separate argv token and reject shell metacharacters so a dynamic candidate list cannot inject commands
- add BatchMode=yes to ssh so kubeconfig reads fail fast on missing keys instead of prompting for a password
- lowercase uuid inputs before short-circuiting hostname lookup so pasted uppercase ids resolve correctly
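
A rough sketch of the ssh hardening described above, under stated assumptions: `validateRemotePath`, `sshCatArgs`, and the `shellMeta` character set are hypothetical and not taken from the PR. ssh joins the remote command arguments into one string for the remote shell, which is why rejecting metacharacters matters even when the path is its own argv token.

```go
package main

import (
	"fmt"
	"strings"
)

// shellMeta lists characters rejected in remote paths. The exact set used by
// clanker is not shown in the PR; this one is an assumption.
const shellMeta = ";&|`$<>(){}[]*?!~\\\"' \t\n\r"

// validateRemotePath accepts only absolute paths free of shell metacharacters.
func validateRemotePath(p string) error {
	if !strings.HasPrefix(p, "/") {
		return fmt.Errorf("remote path must be absolute: %q", p)
	}
	if strings.ContainsAny(p, shellMeta) {
		return fmt.Errorf("remote path contains shell metacharacters: %q", p)
	}
	return nil
}

// sshCatArgs builds the argv for an ssh invocation that reads one remote
// file. BatchMode=yes makes ssh fail instead of prompting for a password.
func sshCatArgs(target, remotePath string) ([]string, error) {
	if err := validateRemotePath(remotePath); err != nil {
		return nil, err
	}
	return []string{"-o", "BatchMode=yes", target, "cat", remotePath}, nil
}
```

The returned slice would be handed to `exec.CommandContext(ctx, "ssh", args...)`, so the local side never passes through a shell.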

- resolve hostname to instance id in verda action so users can pass names instead of uuids
- add container-deployments, job-deployments, container-types, secrets, file-secrets, registry-creds, and balance to verda list
- gather container and job deployment context for ask-mode prompts
- propagate cmd.Context to handleVerdaQuery so ctrl-c cancels in-flight api calls
- check json.Marshal error on action payload instead of discarding it

- uuid parsing, scp target splitting, kubeconfig server rewrite, and the nil-client guard now have coverage in internal/k8s/cluster
- ResolveInstanceID verifies hostname lookup, uuid short-circuit, uppercase normalisation, and unknown-name error path
- sleepCtx exercises both cancellation and zero-duration paths
- routing tests confirm verda keyword hits, datacrunch alias, default-provider fallback, no-false-positive on bare 'gpu', and LLM classification clearing the right providers
- tighten looksLikeUUID to lowercase-only since resolveClusterID already lowercases, matching the verda package mirror
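
The interplay between the lowercase-only check and the normalising caller might look roughly like this. It is a sketch, not the PR's code: `resolveInstanceID` here takes a hypothetical hostname-to-id map in place of the real API lookup, and its signature is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeUUID reports whether s has the 8-4-4-4-12 shape using lowercase
// hex only. Callers are expected to lowercase input first.
func looksLikeUUID(s string) bool {
	if len(s) != 36 {
		return false
	}
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch i {
		case 8, 13, 18, 23:
			if c != '-' {
				return false
			}
		default:
			isDigit := c >= '0' && c <= '9'
			isLowerHex := c >= 'a' && c <= 'f'
			if !isDigit && !isLowerHex {
				return false
			}
		}
	}
	return true
}

// resolveInstanceID short-circuits on uuid-shaped input (after lowercasing)
// and otherwise falls back to a hostname lookup. byHostname is a stand-in
// for the real instance listing.
func resolveInstanceID(input string, byHostname map[string]string) (string, error) {
	if lower := strings.ToLower(strings.TrimSpace(input)); looksLikeUUID(lower) {
		return lower, nil
	}
	if id, ok := byHostname[input]; ok {
		return id, nil
	}
	return "", fmt.Errorf("no instance with hostname %q", input)
}
```

Keeping the check strict and normalising in one place means a pasted uppercase id and its lowercase form resolve to the same value.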

cloudflare, k8s, gcp, azure, digitalocean, hetzner, aws, and iam branches all leaked ctx.Verda through when the llm picked a different provider. a keyword-inferred verda signal paired with an llm-picked aws query would surface both flags and downstream code would run both paths. add a parametric regression test so every sibling provider gets checked.
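
One way to picture the fix is to reset every flag whenever the llm names a provider, so a keyword-inferred flag cannot survive next to it. The sketch below is an assumption-heavy illustration: only `ctx.Verda` is named in this PR, and `serviceContext`, its other fields, `applyLLMChoice`, and the provider strings are hypothetical.

```go
package main

// serviceContext mirrors the idea of per-provider routing flags. Only the
// Verda flag is named in the PR; the other field names are assumptions.
type serviceContext struct {
	AWS, GCP, Azure, Cloudflare, DigitalOcean, Hetzner, K8s, IAM, Verda bool
}

// applyLLMChoice replaces all keyword-inferred flags with the single
// provider the llm picked. It returns false and leaves ctx untouched when
// the provider name is not recognised.
func applyLLMChoice(ctx *serviceContext, provider string) bool {
	picked := serviceContext{}
	switch provider {
	case "aws":
		picked.AWS = true
	case "gcp":
		picked.GCP = true
	case "azure":
		picked.Azure = true
	case "cloudflare":
		picked.Cloudflare = true
	case "digitalocean":
		picked.DigitalOcean = true
	case "hetzner":
		picked.Hetzner = true
	case "k8s":
		picked.K8s = true
	case "iam":
		picked.IAM = true
	case "verda":
		picked.Verda = true
	default:
		return false
	}
	*ctx = picked
	return true
}
```

A table-driven test over the sibling provider names, each starting from `serviceContext{Verda: true}`, is the kind of parametric regression check the commit describes.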

- mcp `clanker_verda_list` switch gains containers, jobs, secrets, file-secrets, and registry-creds, matching the cli surface added in the previous commit
- error on unknown resource now lists every supported value so users don't need to open docs
- `clanker verda balance` prints a human-readable `Balance: $42.17 USD` line in addition to the json body, skipped with --raw for scriptability
- mcp input schema description updated so agents that read the schema know the full enum

pickKubernetesClusterImage used to fall back silently to the default cluster image when no image carried a kubernetes/k8s hint. the cluster would then provision without k8s installed, and GetKubeconfig would later fail opaquely looking for /root/.kube/config. return the label-match as a second boolean so Create can emit a visible warning with the image id in the log, giving the caller a chance to abort or inject a startup script.
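
The two-value return can be sketched as follows. The `clusterImage` type, its fields, and the parameter list are assumptions made for illustration; only the function name and the idea of a second boolean come from the commit message.

```go
package main

import "strings"

// clusterImage is a hypothetical, trimmed-down view of an image entry.
type clusterImage struct {
	ID   string
	Name string
}

// pickKubernetesClusterImage returns the first image whose id or name hints
// at kubernetes. When nothing matches it returns defaultID with matched set
// to false, so the caller can warn instead of provisioning silently.
func pickKubernetesClusterImage(images []clusterImage, defaultID string) (id string, matched bool) {
	for _, img := range images {
		label := strings.ToLower(img.ID + " " + img.Name)
		if strings.Contains(label, "kubernetes") || strings.Contains(label, "k8s") {
			return img.ID, true
		}
	}
	return defaultID, false
}
```

A caller such as Create would check `matched` and log the chosen image id before continuing, which is the visible warning described above.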

previously any path that reached RunVerdaCLI* surfaced a generic exec failure like "exec: not found". with this change:

- new typed sentinel ErrCLINotInstalled + IsCLINotInstalled helper so callers can branch without string matching. the error message now points at the docs + brew install + REST fallback rather than a bare "not found"
- CLIInstalled() preflight lets callers show cli-aware ux eagerly instead of waiting for a real exec error
- resolveVerdaCredentials suggests `verda auth login` only when the binary is actually installed; otherwise it drops that line so the user isn't steered toward a command they can't run
- k8s agent auto-registration now logs a specific reason when the provider is skipped (no creds / partial creds / NewClient failed) at debug level so users investigating "verda-instant missing from cluster types" get a breadcrumb

tests cover ErrCLINotInstalled returning from a stubbed empty PATH, the wrapped error message mentioning docs.verda.com and the REST fallback, and CLIInstalled returning false on an empty PATH.
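
A minimal sketch of the sentinel pattern, assuming the lookup goes through `exec.LookPath`. `ErrCLINotInstalled`, `IsCLINotInstalled`, and `CLIInstalled` are named in the commit; `resolveCLI`, `lookPathFunc`, `lookPathMissing`, and the exact message wording are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// ErrCLINotInstalled is the sentinel callers can match with errors.Is.
var ErrCLINotInstalled = errors.New("verda cli not installed")

// IsCLINotInstalled lets callers branch without string matching.
func IsCLINotInstalled(err error) bool {
	return errors.Is(err, ErrCLINotInstalled)
}

// lookPathFunc matches exec.LookPath so tests can inject a stub.
type lookPathFunc func(file string) (string, error)

// resolveCLI wraps a failed lookup in the sentinel and adds guidance.
// The message text here is illustrative only.
func resolveCLI(lookPath lookPathFunc) (string, error) {
	path, err := lookPath("verda")
	if err != nil {
		return "", fmt.Errorf("%w: see docs.verda.com for install steps, or use the REST fallback (%v)",
			ErrCLINotInstalled, err)
	}
	return path, nil
}

// CLIInstalled is a preflight check against the real PATH.
func CLIInstalled() bool {
	_, err := resolveCLI(exec.LookPath)
	return err == nil
}

// lookPathMissing is a stub that always fails, standing in for an empty PATH.
func lookPathMissing(string) (string, error) {
	return "", exec.ErrNotFound
}
```

Because the sentinel is wrapped with `%w`, callers keep the friendly message while `errors.Is` still matches it.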

two concurrent `clanker ask --verda` invocations for the same project could race on the tmp-file + rename, dropping one conversation's history without returning an error. add a package-level map of per-scope sync.Mutex so both Save and Load take the same lock before touching the file — write barrier first, then the existing struct RWMutex for in-memory state. tests cover the save/load round trip, a 20-goroutine storm against the same scope that asserts the final file parses as valid json, and a leak check that confirms only the final file remains (no stray verda_*.tmp). go test -race is clean.
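
The per-scope lock plus tmp-file rename could be sketched like this. It is a simplified illustration, not the PR's implementation: `scopeLock`, `saveHistory`, `loadHistory`, and their signatures are hypothetical, and the in-memory RWMutex layer is left out.

```go
package main

import (
	"os"
	"path/filepath"
	"sync"
)

var (
	scopeMu    sync.Mutex                   // guards the map itself
	scopeLocks = map[string]*sync.Mutex{}   // one lock per scope
)

// scopeLock returns the mutex for a scope, creating it on first use.
func scopeLock(scope string) *sync.Mutex {
	scopeMu.Lock()
	defer scopeMu.Unlock()
	l, ok := scopeLocks[scope]
	if !ok {
		l = &sync.Mutex{}
		scopeLocks[scope] = l
	}
	return l
}

// saveHistory writes data to a temp file in the target directory and then
// renames it over path, all while holding the scope's lock.
func saveHistory(path, scope string, data []byte) error {
	l := scopeLock(scope)
	l.Lock()
	defer l.Unlock()

	tmp, err := os.CreateTemp(filepath.Dir(path), "verda_*.tmp")
	if err != nil {
		return err
	}
	// Cleans up the temp file on failure; after a successful rename the
	// name no longer exists and the error from Remove is ignored.
	defer os.Remove(tmp.Name())

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// loadHistory takes the same scope lock before reading the file.
func loadHistory(path, scope string) ([]byte, error) {
	l := scopeLock(scope)
	l.Lock()
	defer l.Unlock()
	return os.ReadFile(path)
}
```

Creating the temp file in the same directory as the target keeps the rename on one filesystem. Note that this lock only serialises goroutines within a single process.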

matches the pattern resolveHetznerToken / resolveVercelToken already use. when local resolution (config → env → ~/.verda/credentials) comes up empty and a clanker backend api key is configured, we now try the backend credential store. a 404 from the backend (because the server-side route may not be provisioned yet) gracefully falls through to the existing human-readable "not configured" error.

- new VerdaCredentials type + ProviderVerda constant in internal/backend/types.go
- GetVerdaCredentials + StoreVerdaCredentials client methods following the existing Vercel/Hetzner shape so `clanker credentials store verda ...` can be added server-side later without client updates
- resolveVerdaCredentialsWithContext keeps the sync-safe resolveVerdaCredentials wrapper for non-ctx callers, but handleVerdaQuery now uses the ctx variant so cancellation during the backend round-trip is honoured

the ask-mode backend fallback we just added reads from the clanker backend credential store, but users had no way to push verda creds to it locally. this closes that loop.

- storeVerdaCredentials reads client-id/secret/project-id from --flag, then verda.Resolve* chain (viper → env → ~/.verda/credentials) and uploads to PUT /api/v1/cli/credentials/verda via the new backend client method
- --client-id / --client-secret / --project-id flags added to the store command with the other provider flags
- help text and usage examples updated to include verda
- friendly error when both credential fields are missing points at every configuration path (flags, yaml, env, `verda auth login`)

`clanker credentials test verda` pulls the stored client_id/secret from the clanker backend, spins up a verda.Client, and hits /v1/balance as the cheapest authenticated probe. debug mode prints the returned balance. `clanker credentials delete verda` accepted "verda" as a provider string via the store path but never reached delete; runCredentialsDelete now routes ProviderVerda.

VerdaPlanPromptWithMode instructs the llm to emit a rest-first plan: args are [verda-api, METHOD, /v1/path, body?] instead of a shell command. this keeps maker execution independent of the verda cli (which may not be installed) and reuses the well-understood verda oauth2 + retry client. prompt covers the verda api's most common flows: list, create instance, lifecycle actions (start/shutdown/delete/hibernate), create volume, attach volume, create ssh key, create startup script, create instant cluster with kubernetes image, discontinue cluster, check balance, and enumerate instance-types. documents the binding format so the planner can chain commands (INSTANCE_ID -> next command).

ExecuteVerdaPlan dispatches [verda-api, METHOD, /v1/path, body?] commands directly through verda.Client, with no shell-out and no cli dependency. the existing oauth2 token caching, 429 backoff, and typed error decoding all apply unchanged. plan bindings (<PLACEHOLDER>) and `produces` jsonpath capture are wired through applyPlanBindings + learnPlanBindingsFromProduces so multi-step plans (create instance -> start instance) compose cleanly.

validateVerdaCommand enforces:

- exactly 3-4 args [verda-api, METHOD, path, body?]
- method in GET/POST/PUT/PATCH/DELETE
- path prefix /v1/
- no newlines in any arg
- destructive operations (DELETE, action=delete|discontinue|force_shutdown|delete_stuck|hibernate) gated behind --destroyer

ExecOptions gains VerdaClientID/VerdaClientSecret/VerdaProjectID fields. internal/verda exposes SetBaseURLForTest so the maker's test file can redirect the executor at an httptest server without touching production.

tests cover shape validation, destructive gating (delete/discontinue pass only with --destroyer; start passes always), and end-to-end execution with a real httptest server, verifying oauth2 cache reuse (one token call across two api calls) and jsonpath binding substitution from one command into the next.
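
The validation rules listed for validateVerdaCommand can be sketched as below. This is an illustration under assumptions: the real function's signature is not shown in the PR, and reading the action from an `action` field of a JSON body is a guess about the body format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

var allowedMethods = map[string]bool{
	"GET": true, "POST": true, "PUT": true, "PATCH": true, "DELETE": true,
}

var destructiveActions = map[string]bool{
	"delete": true, "discontinue": true, "force_shutdown": true,
	"delete_stuck": true, "hibernate": true,
}

// validateVerdaCommand checks a [verda-api, METHOD, /v1/path, body?] command.
// destroyer mirrors the --destroyer flag.
func validateVerdaCommand(args []string, destroyer bool) error {
	if len(args) < 3 || len(args) > 4 {
		return fmt.Errorf("expected 3-4 args, got %d", len(args))
	}
	if args[0] != "verda-api" {
		return fmt.Errorf("unexpected command %q", args[0])
	}
	for _, a := range args {
		if strings.ContainsAny(a, "\r\n") {
			return fmt.Errorf("newline in argument")
		}
	}
	method, path := args[1], args[2]
	if !allowedMethods[method] {
		return fmt.Errorf("unsupported method %q", method)
	}
	if !strings.HasPrefix(path, "/v1/") {
		return fmt.Errorf("path must start with /v1/: %q", path)
	}

	destructive := method == "DELETE"
	if len(args) == 4 {
		// Assumption: the body is a JSON object carrying an "action" field.
		var body struct {
			Action string `json:"action"`
		}
		if err := json.Unmarshal([]byte(args[3]), &body); err == nil && destructiveActions[body.Action] {
			destructive = true
		}
	}
	if destructive && !destroyer {
		return fmt.Errorf("destructive operation requires --destroyer")
	}
	return nil
}
```

Validating the argument shape before any request is sent keeps a malformed or unexpectedly destructive llm plan from reaching the api.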

ask.go now routes --maker --verda through the full plan + apply cycle:

- drop the hard-coded "not yet supported for verda" guard
- explicitVerda selects makerProvider="verda" (explicit reason)
- svcCtx.Verda inference selects makerProvider="verda" (inferred reason)
- maker prompt switch calls maker.VerdaPlanPromptWithMode
- verda is included in the read-only "output only" provider list so --maker (without --apply) prints the plan without trying to run it through the aws enrichment pipeline
- --apply path: resolve verda credentials (backend-fallback-enabled) and call maker.ExecuteVerdaPlan with Client{ID,Secret} + ProjectID threaded through ExecOptions

end-to-end: `clanker ask --maker --verda "create one h100 in FIN-01"` emits a json plan; `clanker ask --maker --verda --apply < plan.json` runs the plan's verda-api commands in order, respecting `produces` bindings and the --destroyer gate for delete/discontinue/force_shutdown actions.

nash added 23 commits April 20, 2026 01:40

add verda client with OAuth2 token caching

0275198

register verda command tree in root

804f3bf

wire verda into ask mode and routing keywords

f7d8dee

add verda-instant kubernetes cluster provider

85ab071

expose verda ask and list tools over mcp

11c2829

document verda block in clanker config example

c5d3a0e

update --maker help text to list verda and vercel

3220eda

document verda cloud support in readme

fabce37

rafeegnash merged commit ee0e660 into master Apr 20, 2026
5 checks passed

rafeegnash deleted the feat/verda-provider branch April 20, 2026 10:18
